ABSTRACT

The recording and sharing of cooking recipes, a human ac- 
tivity dating back thousands of years, naturally became an 
early and prominent social use of the web. The resulting 
online recipe collections are repositories of ingredient
combinations and cooking methods whose large scale and
variety yield interesting insights about both the fundamentals of
cooking and user preferences. At the level of an individual 
ingredient we measure whether it tends to be essential or can 
be dropped or added, and whether its quantity can be modi- 
fied. We also construct two types of networks to capture the 
relationships between ingredients. The complement network 
captures which ingredients tend to co-occur frequently, and 
is composed of two large communities: one savory, the other 
sweet. The substitute network, derived from user-generated 
suggestions for modifications, can be decomposed into many 
communities of functionally equivalent ingredients, and cap- 
tures users' preference for healthier variants of a recipe. Our 
experiments reveal that recipe ratings can be well predicted 
with features derived from combinations of ingredient net- 
works and nutrition information. 

Categories and Subject Descriptors 

H.2.8 [Database Management]: Database applications —
Data mining 

General Terms 

Measurement; Experimentation 

Keywords 

ingredient networks, recipe recommendation 

1. INTRODUCTION

The web enables individuals to collaboratively share
knowledge, and recipe websites are one of the earliest examples of
collaborative knowledge sharing on the web. Allrecipes.com,
the subject of our present study, was founded in 1997, years
ahead of other collaborative websites such as Wikipedia.
Recipe sites thrive because individuals are eager to share 
their recipes, from family recipes that had been passed down 
for generations, to new concoctions that they created that 
afternoon, having been motivated in part by the ability to 
share the result online. Once shared, the recipes are imple- 
mented and evaluated by other users, who supply ratings 
and comments. 

The desire to look up recipes online may at first appear 
odd given that tomes of printed recipes can be found in
almost every kitchen. The Joy of Cooking [12] alone con- 
tains 4,500 recipes spread over 1,000 pages. There is, how- 
ever, substantial additional value in online recipes, beyond 
their accessibility. While the Joy of Cooking contains a 
single recipe for Swedish meatballs, Allrecipes.com hosts 
"Swedish Meatballs I", "II", and "III", submitted by different 
users, along with 4 other variants, including "The Amaz- 
ing Swedish Meatball". Each variant has been reviewed, 
from 329 reviews for "Swedish Meatballs I" to 5 reviews 
for "Swedish Meatballs III". The reviews not only provide 
a crowd-sourced ranking of the different recipes, but also 
many suggestions on how to modify them, e.g. using ground 
turkey instead of beef, skipping the "cream of wheat" be- 
cause it is rarely on hand, etc. 

The wealth of information captured by online collabora- 
tive recipe sharing sites is revealing not only of the fun- 
damentals of cooking, but also of user preferences. The co- 
occurrence of ingredients in tens of thousands of recipes pro- 
vides information about which ingredients go well together, 
and when a pairing is unusual. Users' reviews provide clues 
as to the flexibility of a recipe, and the ingredients within 
it. Can the amount of cinnamon be doubled? Can the nut- 
meg be omitted? If one is lacking a certain ingredient, can a 
substitute be found among supplies at hand without a trip 
to the grocery store? Unlike cookbooks, which will contain 
vetted but perhaps not the best variants for some individu- 
als' tastes, ratings assigned to user-submitted recipes allow 
for the evaluation of what works and what does not. 

In this paper, we seek to distill the collective knowledge 
and preference about cooking through mining a popular 
recipe-sharing website. To extract such information, we first 
parse the unstructured text of the recipes and the accom- 
panying user reviews. We construct two types of networks 
that reflect different relationships between ingredients, in 
order to capture users' knowledge about how to combine in- 
gredients. The complement network captures which ingre- 
dients tend to co-occur frequently, and is composed of two 



large communities: one savory, the other sweet. The sub- 
stitute network, derived from user-generated suggestions for 
modifications, can be decomposed into many communities of 
functionally equivalent ingredients, and captures users' pref- 
erence for healthier variants of a recipe. Our experiments 
reveal that recipe ratings can be well predicted by features 
derived from combinations of ingredient networks and
nutrition information (with accuracy 0.792), while most of the
prediction power comes from the ingredient networks (84%).

The rest of the paper is organized as follows. Section [2]
reviews the related work. Section [3] describes the dataset.
Section [4] discusses the extraction of the ingredient and
complement networks and their characteristics. Section [5] presents
the extraction of recipe modification information, as well as 
the construction and characteristics of the ingredient substi- 
tute network. Section [6] presents our experiments on recipe 
recommendation and Section [7] concludes. 

2. RELATED WORK 

Recipe recommendation has been the subject of much 
prior work. Typically the goal has been to suggest recipes 
to users based on their past recipe ratings [15] [3] or
browsing/cooking history [16]. The algorithms then find
similar recipes based on overlapping ingredients, either
treating each ingredient equally [2] or by identifying key
ingredients
Instead of modeling recipes using ingredients,
Wang et al. [17] represent the recipes as graphs which are 
built on ingredients and cooking directions, and they demon- 
strate that graph representations can be used to easily ag- 
gregate Chinese dishes by the flow of cooking steps and the 
sequence of added ingredients. However, their approach only 
models the occurrence of ingredients or cooking methods, 
and doesn't take into account the relationships between in- 
gredients. In contrast, in this paper we incorporate the like- 
lihood of ingredients to co-occur, as well as the potential of 
one ingredient to act as a substitute for another. 

Another branch of research has focused on recommend- 
ing recipes based on desired nutritional intake or promoting 
healthy food choices. Geleijnse et al. [7] designed a proto- 
type of a personalized recipe advice system, which suggests 
recipes to users based on their past food selections and nutri- 
tion intake. In addition to nutrition information, Kamieth 
et al. [9] built a personalized recipe recommendation system 
based on availability of ingredients and personal nutritional 
needs. Shidochi et al. [14] proposed an algorithm to extract
replaceable ingredients from recipes in order to satisfy users' 
various demands, such as calorie constraints and food avail- 
ability. Their method identifies substitutable ingredients by 
matching the cooking actions that correspond to ingredient 
names. However, their assumption that substitutable ingre- 
dients are subject to the same processing methods is less di- 
rect and specific than extracting substitutions directly from 
user-contributed suggestions. 

Ahn et al. [1] and Kinouchi et al. [10] examined networks
involving ingredients derived from recipes, with the former 
modeling ingredients by their flavor bonds, and the latter 
examining the relationship between ingredients and recipes. 
In contrast, we derive direct ingredient-ingredient networks 
of both complements and substitutes. We also step beyond
characterizing these networks to demonstrating that they 
can be used to predict which recipes will be successful. 



3. DATASET 

Allrecipes.com is one of the most popular recipe-sharing 
websites, where novice and expert cooks alike can upload 
and rate cooking recipes. It hosts 16 customized interna- 
tional sites for users to share their recipes in their native 
languages, of which we study only the main, English, ver- 
sion. Recipes uploaded to the site contain specific instruc- 
tions on how to prepare a dish: the list of ingredients, prepa- 
ration steps, preparation and cook time, the number of serv- 
ings produced, nutrition information, serving directions, and 
photos of the prepared dish. The uploaded recipes are en- 
riched with user ratings and reviews, which comment on 
the quality of the recipe, and suggest changes and improve- 
ments. In addition to rating and commenting on recipes, 
users are able to save them as favorites or recommend them 
to others through a forum. 

We downloaded 46,337 recipes from allrecipes.com, together
with all of the information listed above and several
classifications, such as a region (e.g. the Midwest region of the US or
Europe), the course or meal the dish is appropriate for (e.g.
appetizers or breakfast), and any holidays the dish may be
associated with. In order to understand users' recipe
preferences, we crawled 1,976,920 reviews, which include reviewers'
ratings, review text, and the number of users who voted the
review as useful.

3.1 Data preprocessing 

The first step in processing the recipes is identifying the 
ingredients and cooking methods from the freeform text of 
the recipe. Usually, although not always, each ingredient 
is listed on a separate line. To extract the ingredients, we 
tried two approaches. In the first, we found the maximal 
match between a pre-curated list of ingredients and the text 
of the line. However, this missed too many ingredients, 
while misidentifying others. In the second approach, we 
used regular expression matching to remove non-ingredient 
terms from the line and identified the remainder as the
ingredient. We removed quantifiers, such as "1 lb" or "2
cups", words referring to consistency or temperature, e.g. 
chopped or cold, along with a few other heuristics, such as 
removing content in parentheses. For example "1 (28 ounce) 
can baked beans (such as Bush's Original®)" is identified 
as "baked beans". By limiting the list of potential terms 
to remove from an ingredient entry, we erred on the side 
of not conflating potentially identical or highly similar in- 
gredients, e.g. "cheddar cheese", used in 2450 recipes, was 
considered different from "sharp cheddar cheese", occurring 
in 394 recipes. 
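
The second, regular-expression approach can be sketched roughly as follows. This is a minimal illustration rather than the extraction code used in the study: the `UNITS` and `DESCRIPTORS` term lists and the function name `extract_ingredient` are assumptions made for the example, and the real lists of terms to remove are not given in the text.

```python
import re

# Hypothetical, deliberately short term lists (assumptions for this sketch).
UNITS = r"(?:lbs?|pounds?|ounces?|oz|cups?|teaspoons?|tablespoons?|cans?|packages?)"
DESCRIPTORS = r"(?:chopped|diced|sliced|minced|cold|warm|melted|softened)"

def extract_ingredient(line: str) -> str:
    """Strip non-ingredient terms from one ingredient line and return the rest."""
    text = line.lower()
    text = re.sub(r"\([^)]*\)", " ", text)            # drop content in parentheses
    text = re.sub(r"\d+(?:\s*/\s*\d+)?", " ", text)   # drop numbers and fractions such as 1/2
    text = re.sub(rf"\b{UNITS}\b", " ", text)         # drop measurement units
    text = re.sub(rf"\b{DESCRIPTORS}\b", " ", text)   # drop consistency/temperature words
    text = re.sub(r"[^a-z\s-]", " ", text)            # drop remaining punctuation
    return " ".join(text.split())                     # normalize whitespace

print(extract_ingredient("1 (28 ounce) can baked beans (such as Bush's Original®)"))
```

Because only listed terms are removed, "sharp cheddar cheese" passes through unchanged and stays distinct from "cheddar cheese", which matches the conservative behavior described above.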

We then generated an ingredient list sorted by frequency 
of ingredient occurrence and selected the top 1000 common 
ingredient names as our finalized ingredient list. Each of the 
top 1000 ingredients occurred in 23 or more recipes, with 
plain salt making an appearance in 47.3% of recipes. These 
ingredients also accounted for 94.9% of ingredient entries in 
the recipe dataset. The remaining ingredients were missed 
either because of high specificity (e.g. yolk-free egg noodle), 
referencing brand names (e.g. Planters almonds), rarity (e.g. 
serviceberry), misspellings, or not being a food (e.g. "nylon 
netting"). 

The remaining processing task was to identify cooking 
processes from the directions. We first identified all heating 
methods using a listing in the Wikipedia entry on cooking 
[18]. For example, baking, boiling, and steaming are all ways
of heating the food. We then identified mechanical ways of
processing the food such as chopping and grinding, and other
chemical techniques such as marinating and brining.

Figure 1: The percentage of recipes by region that
apply a specific heating method.

3.2 Regional preferences 

Choosing one cooking method over another appears to be 
a question of regional taste. 5.8% of recipes were classified 
into one of five US regions: Mountain, Midwest, Northeast, 
South, and West Coast (including Alaska and Hawaii).
Figure [1] shows significantly ($\chi^2$ test, p-value < 0.001) varying
preferences in the different US regions among 6 of the most 
popular cooking methods. Boiling and simmering, both in- 
volving heating food in hot liquids, are more common in the 
South and Midwest. Marinating and grilling are relatively 
more popular in the West and Mountain regions, but in the 
West more grilling recipes involve seafood (18/42 = 42%) 
relative to other regions combined (7/106 = 6%). Frying 
is popular in the South and Northeast. Baking is a univer- 
sally popular and versatile technique, which is often used for 
both sweet and savory dishes, and is slightly more popular 
in the Northeast and Midwest. Examination of individual 
recipes reflecting these frequencies shows that these differ- 
ences in preference can be tied to differences in demograph- 
ics, immigrant culture and availability of local ingredients, 
e.g. seafood. 

4. INGREDIENT COMPLEMENT NETWORK 

Can we learn how to combine ingredients from the data? 
Here we employ the occurrences of ingredients across recipes 
to distill users' knowledge about combining ingredients. 

We constructed an ingredient complement network based 
on pointwise mutual information (PMI) defined on pairs of 
ingredients (a, b): 



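$$\mathrm{PMI}(a,b) = \log \frac{p(a,b)}{p(a)\,p(b)},$$

where

$$p(a,b) = \frac{\#\text{ of recipes containing } a \text{ and } b}{\#\text{ of recipes}}, \qquad
p(a) = \frac{\#\text{ of recipes containing } a}{\#\text{ of recipes}}, \qquad
p(b) = \frac{\#\text{ of recipes containing } b}{\#\text{ of recipes}}.$$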



The PMI compares the probability that two ingredients occur
together against the probability of their co-occurring if they
were used independently of each other.
Complementary ingredients tend to occur together far more
often than would be expected by chance.
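
As a rough illustration of the PMI computation, the sketch below counts single and pairwise ingredient occurrences over a toy set of recipes. The function name `pmi_table` and the four example recipes are assumptions for this example only; they are not data from the study.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(recipes):
    """Return {(a, b): PMI(a, b)} for every co-occurring pair, keyed with a < b."""
    n = len(recipes)
    single = Counter()   # number of recipes containing each ingredient
    pair = Counter()     # number of recipes containing each ingredient pair
    for recipe in recipes:
        items = sorted(set(recipe))          # count each ingredient once per recipe
        single.update(items)
        pair.update(combinations(items, 2))
    return {
        (a, b): math.log((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }

# Toy data (hypothetical): two sweet and two savory recipes.
recipes = [
    ["flour", "sugar", "butter"],
    ["flour", "sugar", "egg"],
    ["garlic", "onion", "salt"],
    ["garlic", "onion", "butter"],
]
pmi = pmi_table(recipes)
print(pmi[("flour", "sugar")], pmi[("butter", "flour")])
```

In this toy set, flour and sugar always appear together and receive a positive PMI, while butter and flour co-occur exactly as often as independence would predict and receive a PMI of zero. Pairs that never co-occur are simply absent from the table.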

Figure [2] shows a visualization of ingredient complemen- 
tarity. Two distinct subcommunities of recipes are imme- 
diately apparent: one corresponding to savory dishes, the 
other to sweet ones. Some central ingredients, e.g. egg and 
salt, actually are pushed to the periphery of the network. 
They are so ubiquitous, that although they have many edges, 
they are all weak, since they don't show particular comple- 
mentarity with any single group of ingredients. 

We further probed the structure of the complementarity 
network by applying a network clustering algorithm [13] . 
The algorithm confirmed the existence of two main clusters 
containing the vast majority of the ingredients. An interest- 
ing satellite cluster is that of mixed drink ingredients, which 
is evident as a constellation of small nodes located near the 
top of the sweet cluster in Figure [2]. The cluster includes
the following ingredients: lime, rum, ice, orange, pineapple 
juice, vodka, cranberry juice, lemonade, tequila, etc. 

For each recipe we recorded the minimum, average, and 
maximum pairwise pointwise mutual information between 
ingredients. The intuition is that complementary ingredi- 
ents would yield higher ratings, while ingredients that don't 
go together would lower the average rating. We found that 
while the average and minimum pointwise mutual informa- 
tion between ingredients is uncorrelated with ratings, the 
maximum is very slightly positively correlated with the
average rating for the recipe ($\rho = 0.09$, p-value $< 10^{-10}$). This
suggests that having at least two complementary ingredients 
very slightly boosts a recipe's prospects, but having clashing 
or unrelated ingredients does not seem to do harm. 

5. RECIPE MODIFICATIONS 

Co-occurrence of ingredients aggregated over individual 
recipes reveals the structure of cooking, but tells us little 
about how flexible the ingredient proportions are, or whether 
some ingredients could easily be left out or substituted. An 
experienced cook may know that apple sauce is a low-fat al- 
ternative to oil, or may know that nutmeg is often optional, 
but a novice cook may implement recipes literally, afraid 
that deviating from the instructions may produce poor re- 
sults. While a traditional hardcopy cookbook would provide 
few such hints, they are plentiful in the reviews submitted 
by users who implemented the recipes, e.g. "This is a great 
recipe, but using fresh tomatoes only adds a few minutes to 
the prep time and makes it taste so much better", or another 
comment about the same salsa recipe "This is by far the best 
recipe we have ever come across. We did however change it 
just a little bit by adding extra onion. " 

As the examples illustrate, modifications are reported even 
when the user likes the recipe. In fact, we found that 60.1% 
of recipe reviews contain words signaling modification, such 
as "add", "omit", "instead", "extra" and 14 others. Further- 
more, it is the reviews that include changes that have a sta- 
tistically higher average rating (4.49 vs. 4.39, t-test p-value 
$< 10^{-10}$), and lower rating variance (0.82 vs. 1.05, Bartlett
test p-value $< 10^{-10}$), as is evident in the distribution of
ratings, shown in Fig. [3]. This suggests that flexibility in
recipes is not necessarily a bad thing, and that reviewers 
who don't mention modifications are more likely to think of 
the recipe as perfect, or to dislike it entirely. 

Figure 2: Ingredient complement network. Two ingredients share an edge if they occur together more than
would be expected by chance and if their pointwise mutual information exceeds a threshold.




Figure 3: The likelihood that a review suggests a
modification to the recipe depends on the star rating
the review is assigning to the recipe (horizontal axis:
rating, 1 to 5 stars).

In the following, we describe the recipe modifications ex- 
tracted from user reviews, including adjustment, deletion 
and addition. We then present how we constructed an in- 
gredient substitute network based on the extracted informa- 
tion. 

5.1 Adjustments 

Some modifications involve increasing or decreasing the 
amount of an ingredient in the recipe. In this and the fol- 
lowing analyses, we split the review on punctuation such 
as commas and periods. We used simple heuristics to de- 
tect when a review suggested a modification: adding/using 
more/less of an ingredient counted as an increase/decrease. 
Doubling or increasing counted as an increase, while reduc- 
ing, cutting, or decreasing counted as a decrease. While it is 
likely that there are other expressions signaling the adjust- 
ment of ingredient quantities, using this set of terms allowed 



us to compare the relative rate of modification, as well as 
the frequency of increase vs. decrease between ingredients. 
The ingredients themselves were extracted by performing a 
maximal character match within a window following an ad- 
justment term. 
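
The heuristics above might be sketched as follows. This is an illustrative approximation, not the rule set used in the study: the two regular expressions, the 40-character `window`, and the function name `find_adjustments` are assumptions, and "maximal character match" is interpreted here as taking the longest ingredient name found in the window.

```python
import re

# Adjustment terms taken from the description above; the exact patterns are assumptions.
INCREASE = re.compile(r"\b(?:more|doubl\w*|increas\w*)\b")
DECREASE = re.compile(r"\b(?:less|reduc\w*|cut(?:ting)?|decreas\w*)\b")

def find_adjustments(review, ingredients, window=40):
    """Return (ingredient, direction) pairs detected in a review."""
    found = []
    for clause in re.split(r"[.,;!?]", review.lower()):   # split on punctuation
        for direction, pattern in (("increase", INCREASE), ("decrease", DECREASE)):
            match = pattern.search(clause)
            if not match:
                continue
            # Look only at a window of text following the adjustment term.
            tail = clause[match.end():match.end() + window]
            hits = [name for name in ingredients if name in tail]
            if hits:
                found.append((max(hits, key=len), direction))  # longest match wins
    return found

review = "Great cake, but I used less sugar, and doubled the vanilla extract."
names = ["sugar", "brown sugar", "vanilla", "vanilla extract"]
print(find_adjustments(review, names))
```

Preferring the longest matching name keeps "vanilla extract" from being reported as plain "vanilla" when both are in the ingredient list.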

Figure [4] shows the ratios of the number of reviews sug- 
gesting modifications, either increases or decreases, to the 
number of recipes that contain the ingredient. Two patterns 
are immediately apparent. Ingredients that may be per- 
ceived as being unhealthy, such as fats and sugars, are, with 
the exception of vegetable oil and margarine, more likely 
to be modified, and to be decreased. On the other hand, 
flavor enhancers such as soy sauce, lemon juice, cinnamon, 
Worcestershire sauce, and toppings such as cheeses, bacon 
and mushrooms, are also likely to be modified; however, they 
tend to be added in greater, rather than lesser quantities. 
Combined, the patterns suggest that good-tasting but "un- 
healthy" ingredients can be reduced, if desired, while spices, 
extracts, and toppings can be increased to taste. 

5.2 Deletions and additions 

Recipes are also frequently modified such that ingredients 
are omitted entirely. We looked for words indicating that 
the reviewer did not have an ingredient (and hence did not 
use it), e.g. "had no" and "didn't have". We further used 
"omit/left out/left off/bother with" as indication that the 
reviewer had omitted the ingredients, potentially for other 
reasons. Because reviewers often used simplified terms, e.g. 
"vanilla" instead of "vanilla extract", we compared words in 
proximity to the action words by constructing 4-character- 
grams and calculating the cosine similarity between the n- 
grams in the review and the list of ingredients for the recipe. 
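
A minimal sketch of the character n-gram matching is given below. The padding with spaces, the 0.3 acceptance `threshold`, and the helper names are assumptions introduced for illustration; the text specifies only 4-character-grams and cosine similarity.

```python
import math
from collections import Counter

def char_ngrams(text, n=4):
    """Count overlapping character n-grams of a space-padded, lowercased string."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(u, v):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(u[g] * v[g] for g in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def best_match(mention, recipe_ingredients, threshold=0.3):
    """Return the recipe ingredient most similar to the mention, or None."""
    grams = char_ngrams(mention)
    score, name = max((cosine(grams, char_ngrams(c)), c) for c in recipe_ingredients)
    return name if score >= threshold else None

print(best_match("vanilla", ["vanilla extract", "white sugar", "egg"]))
```

Matching on character n-grams rather than whole words lets a shortened mention such as "vanilla" still align with "vanilla extract", while unrelated mentions fall below the threshold and are discarded.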

To identify additions, we simply looked for the word "add", 
but omitted possible substitutions. For example, we would 
use "added cucumber", but not "added cucumber instead of 
green pepper", the latter of which we analyze in the follow- 
ing section. We then compared the addition to the list of 
ingredients in the recipes, and considered the addition valid 
only if the ingredient does not already belong in the recipe. 

Figure 4: Suggested modifications of quantity for
the 50 most common ingredients, derived from 
recipe reviews. The line denotes equal numbers of 
suggested quantity increases and decreases. 

Table [1] shows the correlation between ingredient
modifications. As might be expected, the more frequently an
ingredient occurs in recipes, the more times its quantity has
the opportunity to be modified, as is evident in the strong
correlation between the number of recipes the ingredient
occurs in and both increases and decreases recommended in 
reviews. However, the more common an ingredient, the more 
stable it appears to be. Recipe frequency is negatively
correlated with deletions/recipe ($\rho = -0.22$), additions/recipe
($\rho = -0.25$), and increases/recipe ($\rho = -0.26$). For
example, salt is so essential, appearing in over 21,000 recipes, that
we detected only 18 reviews where it was explicitly dropped.
In contrast, Worcestershire sauce, appearing in 1,542 recipes,
is dropped explicitly in 148 reviews.

As might also be expected, additions are positively corre- 
lated with increases, and deletions with decreases. However, 
additions and deletions are very weakly negatively corre- 
lated, indicating that an ingredient that is added frequently 
is not necessarily omitted more frequently as well. 



Table 1: Correlations between ingredient modifica- 
tions 

5.3 Ingredient substitute network

Replacement relationships show whether one ingredient 
is preferable to another. The preference could be based 
on taste, availability, or price. Some ingredient
substitution tables can be found online¹, but they are neither extensive
nor do they contain information about the relative frequency of each
substitution. Thus, we found an alternative source for
extracting replacement relationships: users' comments, e.g.
"I replaced the butter in the frosting by sour cream, just to
soothe my conscience about all the fatty calories".

¹ e.g., http://allrecipes.com/HowTo/common-ingredient-substitutions/detail.aspx

Figure 5: Ingredient substitute network. Nodes are
sized according to the number of times they have
been recommended as a substitute for another
ingredient, and colored according to their indegree.

To extract such knowledge, we first parsed the reviews 
as follows: we considered several phrases to signal
replacement relationships: "replace a with b", "substitute b for a",
"b instead of a", etc., and matched a and b to our list of
ingredients. 

We constructed an ingredient substitute network to cap- 
ture users' knowledge about ingredient replacement. This 
weighted, directed network consists of ingredients as nodes. 
We thresholded and eliminated any suggested substitutions 
that occurred fewer than 5 times. We then determined the 
weight of each edge by p(b|a), the proportion of
substitutions of ingredient a that suggest ingredient b. For example,
68% of substitutions for white sugar were to Splenda, an
artificial sweetener, and hence the assigned weight for the
sugar → Splenda edge is 0.68.
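
The thresholding and weighting steps can be illustrated with the sketch below. It assumes the (original, replacement) pairs have already been extracted from reviews, and it normalizes each weight over the substitutions that survive the threshold; whether the study normalized before or after thresholding is not stated, so that choice, like the function name `substitute_network` and the toy counts, is an assumption.

```python
from collections import Counter, defaultdict

def substitute_network(substitutions, min_count=5):
    """Build {(a, b): p(b|a)} from (original, replacement) pairs mined from reviews."""
    counts = Counter(substitutions)
    # Discard suggested substitutions that occur fewer than min_count times.
    kept = {edge: c for edge, c in counts.items() if c >= min_count}
    totals = defaultdict(int)
    for (a, _), c in kept.items():
        totals[a] += c                      # retained substitutions of ingredient a
    return {(a, b): c / totals[a] for (a, b), c in kept.items()}

# Toy counts (hypothetical), chosen to reproduce a 0.68 weight.
observed = (
    [("white sugar", "splenda")] * 17
    + [("white sugar", "honey")] * 8
    + [("white sugar", "maple syrup")] * 2   # below the threshold, dropped
)
network = substitute_network(observed)
print(network)
```

The result is a weighted, directed edge list: the outgoing weights of each retained ingredient sum to one, and the edge direction runs from the original ingredient to its suggested replacement.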

The resulting substitution network, shown in Figure [5],
exhibits strong clustering. We examined this structure by 
applying the map generator tool by Rosvall et al. [13], which 
uses a random walk approach to identify clusters in weighted, 
directed networks. The resulting clusters, and their
relationships to one another, are shown in Fig. [6]. The derived
clusters could be used when following a relatively new recipe
which may not receive many reviews, and therefore many 
suggestions for ingredient substitutions. If one does not have 
all ingredients at hand, one could examine the content of 
one's fridge and pantry and match it with other ingredients 
found in the same cluster as the ingredient called for by 
the recipe. Table [2] lists the contents of a few such sample 
ingredient clusters, and Fig. [7] shows two example clusters 
extracted from the substitute network. 



Table 2: Clusters of ingredients that can be substi- 
tuted for one another. A maximum of 5 additional 
ingredients for each cluster are listed, ordered by 
PageRank. 



main                other ingredients
------------------  ---------------------------------------------------
chicken             turkey, beef, sausage, chicken breast, bacon
olive oil           butter, apple sauce, oil, banana, margarine
sweet potato        yam, potato, pumpkin, butternut squash, parsnip
baking powder       baking soda, cream of tartar
almond              pecan, walnut, cashew, peanut, sunflower s.
apple               peach, pineapple, pear, mango, pie filling
egg                 egg white, egg substitute, egg yolk
tilapia             cod, catfish, flounder, halibut, orange roughy
spinach             mushroom, broccoli, kale, carrot, zucchini
italian seasoning   basil, cilantro, oregano, parsley, dill
cabbage             coleslaw mix, sauerkraut, bok choy, napa cabbage

Finally, we examine whether the substitution network en- 
codes preferences for one ingredient over another, as evi- 
denced by the relative ratings of similar recipes, one which 
contains an original ingredient, and another which imple- 
ments a substitution. To test this hypothesis, we construct 
a "preference network", where one ingredient is preferred to 
another in terms of received ratings, and is constructed by 
creating an edge (a, b) between a pair of ingredients, where a 
and b are listed in two recipes X and Y respectively, if recipe 
ratings R_X > R_Y. For example, if recipe X includes beef,
ketchup and cheese, and recipe Y contains beef and
pickles, then this recipe pair contributes to two edges: one from
pickles to ketchup, and the other from pickles to cheese. The
aggregate edge weights are defined based on PMI. Because
PMI is a symmetric quantity (PMI(a; b) = PMI(b; a)), we
introduce a directed PMI measure to cope with the
directionality of the preference network:



$$\mathrm{PMI}(a \to b) = \log \frac{p(a \to b)}{p(a)\,p(b)},$$

where

$$p(a \to b) = \frac{\#\text{ of recipe pairs from } a \text{ to } b}{\#\text{ of recipe pairs}},$$

and p(a), p(b) are defined as in the previous section.

We find high correlation between this preference network 
and the substitution network ($\rho = 0.72$, p < 0.001). This
observation suggests that the substitute network encodes users'
ingredient preference, which we use in the recipe prediction 
task described in the next section. 

6. RECIPE RECOMMENDATION 

We use the above insights to develop novel algorithms
suitable for recipe recommendation. We
use ingredients and the relationships encoded between them 
in ingredient networks as our main feature sets to predict 
recipe ratings, and compare them against features encod- 
ing nutrition information, as well as other baseline features 
such as cooking methods, and preparation and cook time. 

Figure 6: Ingredient substitution clusters. Nodes
represent clusters and edges indicate the presence of
cluster represents a set of related ingredients which
are frequently substituted for one another.

Figure 7: Relationships between ingredients located
within two of the clusters from Fig. [6].

Then we apply a discriminative machine learning method, 
stochastic gradient boosting trees [6], to predict recipe rat- 
ings. 

In the experiments, we seek to answer the following three 
questions. (1) Can we predict users' preference for a new 
recipe given the information present in the recipe? (2) What 
are the key aspects that determine users' preference? (3) 
Does the structure of ingredient networks help in recipe rec- 
ommendation, and how? 

6.1 Recipe Pair Prediction 

The goal of our prediction task is: given a pair of similar 
recipes, determine which one has higher average rating than 
the other. This task is designed particularly to help users 
with a specific dish or meal in mind, and who are trying to 
decide between several recipe options for that dish. 

Recipe pair data. The data for this prediction task 
consists of pairs of similar recipes. The reason for select- 
ing similar recipes, with high ingredient overlap, is that 
while apples may be quite comparable to oranges in the 
context of recipes, especially if one is evaluating salads or 
desserts, lasagna may not be comparable to a mixed drink. 
To derive pairs of related recipes, we computed similarity 



with a cosine similarity between the ingredient lists for the 
two recipes, weighted by the inverse document frequency, 
log(# of recipes / # of recipes containing the ingredient).
We considered only those pairs of recipes whose cosine sim- 
ilarity exceeded 0.2. The weighting is intended to identify 
higher similarity among recipes sharing more distinguishing 
ingredients, such as Brussels sprouts, as opposed to recipes 
sharing very common ones, such as butter. 
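
The IDF-weighted cosine similarity can be sketched as follows. The function names and the four toy recipes are assumptions for illustration; each recipe is treated as a binary ingredient vector whose entries are scaled by the inverse document frequency of the ingredient.

```python
import math
from collections import Counter

def idf_weights(recipes):
    """IDF per ingredient: log(# of recipes / # of recipes containing the ingredient)."""
    n = len(recipes)
    df = Counter(ing for recipe in recipes for ing in set(recipe))
    return {ing: math.log(n / count) for ing, count in df.items()}

def recipe_similarity(r1, r2, idf):
    """Cosine similarity between two IDF-weighted binary ingredient vectors."""
    s1, s2 = set(r1), set(r2)
    dot = sum(idf[i] ** 2 for i in s1 & s2)
    n1 = math.sqrt(sum(idf[i] ** 2 for i in s1))
    n2 = math.sqrt(sum(idf[i] ** 2 for i in s2))
    return dot / (n1 * n2) if n1 and n2 else 0.0

# Toy data (hypothetical): butter is in every recipe, Brussels sprouts in only two.
recipes = [
    ["butter", "brussels sprouts", "bacon"],
    ["butter", "brussels sprouts", "garlic"],
    ["butter", "flour", "sugar"],
    ["butter", "egg", "milk"],
]
idf = idf_weights(recipes)
print(recipe_similarity(recipes[0], recipes[1], idf))
print(recipe_similarity(recipes[0], recipes[2], idf))
```

Because butter appears in every toy recipe, its IDF is zero and it contributes nothing to the similarity, so the first and third recipes score zero even though they share an ingredient. The shared Brussels sprouts are what make the first two recipes similar.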

A further challenge to obtaining reliable relative rankings 
of recipes is variance introduced by having different users 
choose to rate different recipes. In addition, some users 
might not have a sufficient number of reviews under their 
belt to have calibrated their own rating scheme. To con- 
trol for variation introduced by users, we examined recipe 
pairs where the same users are rating both recipes and are 
collectively expressing a preference for one recipe over an- 
other. Specifically, we generated 62,031 recipe pairs (a, b) 
where rating_i(a) > rating_i(b), for at least 10 users i, and
over 50% of users who rated both recipe a and recipe b. Fur- 
thermore, each user i should be an active enough reviewer 
to have rated at least 8 other recipes. 
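
One possible reading of this filtering procedure is sketched below. It is an interpretation, not the study's code: the input format (`{user: {recipe: stars}}`), the function name `preferred_pairs`, and the treatment of "at least 8 other recipes" as "at least 10 rated recipes in total" are all assumptions, and the ingredient-similarity filter described earlier is omitted for brevity.

```python
from collections import defaultdict
from itertools import combinations

def preferred_pairs(ratings, min_users=10, min_share=0.5, min_other=8):
    """ratings: {user: {recipe: stars}}. Return ordered pairs (a, b), a preferred to b."""
    wins = defaultdict(int)   # (a, b) -> number of users rating a above b
    both = defaultdict(int)   # frozenset({a, b}) -> number of users rating both
    for user_ratings in ratings.values():
        # Keep only active reviewers: the pair itself plus min_other further recipes.
        if len(user_ratings) < min_other + 2:
            continue
        for a, b in combinations(sorted(user_ratings), 2):
            both[frozenset((a, b))] += 1
            if user_ratings[a] > user_ratings[b]:
                wins[(a, b)] += 1
            elif user_ratings[b] > user_ratings[a]:
                wins[(b, a)] += 1
    return [
        pair for pair, w in wins.items()
        if w >= min_users and w > min_share * both[frozenset(pair)]
    ]

# Toy data (hypothetical): 12 users prefer recipe "a", 3 prefer "b", 1 is inactive.
filler = {f"x{j}": 4 for j in range(8)}
ratings = {f"u{i}": {"a": 5, "b": 3, **filler} for i in range(12)}
ratings.update({f"v{i}": {"a": 2, "b": 4, **filler} for i in range(3)})
ratings["inactive"] = {"a": 1, "b": 5}
pairs = preferred_pairs(ratings)
print(("a", "b") in pairs, ("b", "a") in pairs)
```

Comparing ratings within the same user removes differences in how individual users calibrate their stars, which is the motivation given above for restricting to users who rated both recipes.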

Features. In the prediction dataset, each observation 
consists of a set of predictor variables or features that rep- 
resent information about two recipes, and the response vari- 
able is a binary indicator of which gets the higher rating on 
average. To study the key aspects of recipe information, we 
constructed different sets of features, including:

• Baseline: This includes cooking methods, such as chop- 
ping, marinating, or grilling, and cooking effort de- 
scriptors, such as preparation time in minutes, as well 
as the number of servings produced, etc. These fea- 
tures are considered as primary information about a 
recipe and will be included in all other feature sets 
described below. 

• Full ingredients: We selected up to 1000 popular ingre- 
dients to build a "full ingredient list". In this feature 
set, each observed recipe pair contains a vector with 
entries indicating whether an ingredient from the full 
list is present in either recipe in the pair. 

• Nutrition: This feature set does not include any
ingredients but only nutrition information such as the total
caloric content, as well as quantities of fats, carbohy- 
drates, etc. 

• Ingredient networks: In this set, we replaced the full 
ingredient list by structural information extracted from 
different ingredient networks, as described in Sections [4]
and [5.3]. Co-occurrence is treated separately as a raw
count, and as a complementarity measure, captured by the PMI.

• Combined set: Finally, a combined feature set is con- 
structed to test the performance of a combination of 
features, including baseline, nutrition and ingredient 
networks. 

To build the ingredient network feature set, we extracted 
the following two types of structural information from the 
co-occurrence and substitution networks, as well as the com- 
plement network derived from the co-occurrence informa- 
tion: 

Network positions are calculated to represent how a recipe's
ingredients occupy positions within the networks. Such
position measures are likely to indicate whether a recipe
contains any "popular" or "unusual" ingredients. To calculate
the position measures, we first calculated various network
centrality measures, including degree centrality, betweenness
centrality, etc., from the ingredient networks. A centrality
measure can be represented as a vector $g$ where each entry
indicates the centrality of an ingredient. The network position
of a recipe, with its full ingredient list represented as a
binary vector $f$, can be summarized by $g^T \cdot f$, i.e., an
aggregated centrality measure based on the centrality of its
ingredients.

Figure 8: Prediction performance. The nutrition
information and ingredient networks are more effective
features than full ingredients. The ingredient
network features lead to impressive performance,
close to the best performance.

Network communities provide information about which 
ingredient is more likely to co-occur with a group of other 
ingredients in the network. A recipe consisting of ingredients 
that are frequently used with, complemented by or substi- 
tuted by certain groups may be predictive of the ratings 
the recipe will receive. To obtain the network community 
information, we applied latent semantic analysis (LSA) on 
recipes. We first factorized each ingredient network, rep- 
resented by matrix W, using singular value decomposition 
(SVD). In the matrix $W$, each entry $W_{ij}$ indicates whether
ingredient $i$ co-occurs with, complements or substitutes for
ingredient $j$.

Suppose $W_k = U_k \Sigma_k V_k^T$ is a rank-$k$ approximation of $W$,
we can then transform each recipe's full ingredient list using
the low-dimensional representation, $\Sigma_k^{-1} V_k^T f$, as community
information within a network. These low-dimensional vectors,
together with the vectors of network positions, constitute
the ingredient network features.
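
The projection $\Sigma_k^{-1} V_k^T f$ can be sketched with NumPy as below. This is an illustrative sketch under stated assumptions: the function name is hypothetical, W is a small dense array, and the k retained singular values are assumed to be nonzero.

```python
import numpy as np

def community_features(W, f, k):
    """Project a binary ingredient vector f onto the top-k singular
    directions of the network matrix W (hypothetical helper)."""
    # Thin SVD: W = U diag(s) Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_k, Vt_k = s[:k], Vt[:k, :]
    # Dividing by s_k applies Sigma_k^{-1}; assumes s_k has no zeros.
    return (Vt_k @ f) / s_k
```

For a vocabulary as large as the full ingredient list, a truncated sparse solver would likely be preferable to a dense SVD, but the resulting features have the same form.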

Learning method. We applied discriminative machine 
learning methods such as support vector machines (SVM) [5] 
and stochastic gradient boosting trees [H] to our prediction 
problem. Here we report and discuss the detailed results 
based on the gradient boosting tree model. Like SVM, the
gradient boosting tree model seeks a parameterized classifier,
but whereas SVM considers all the features at once,
the boosting tree model considers a set of features
at a time and iteratively combines them according to their
empirical errors. In practice, it not only performs
competitively with SVM, but can also serve as a feature
ranking procedure [11].

Figure 9: Relative importance of features in the
combined set. The individual items from nutrition
information are very indicative in differentiating
highly rated recipes, while most of the prediction
power comes from ingredient networks.

Figure 10: Relative importance of features representing
the network structure. The substitution network
has the strongest contribution (39.8%) to the
total importance of network features, and it also has
more influential features in the top 100 list, which
suggests that the substitution network is complementary
to other features.

In this work, we fitted a stochastic gradient boosting tree
model with 8 terminal nodes under an exponential loss function.
The dataset is roughly balanced in terms of which
recipe is the higher-rated one within a pair. We randomly
divided the dataset into a training set (2/3) and a testing
set (1/3). The prediction performance is evaluated based on
accuracy, and the feature performance is evaluated in terms
of relative importance (footnote 8). At each internal node of a
single decision tree, one of the input variables, $x_j$, is used
to partition the region associated with that node into two
subregions in order to fit the response values. The squared
relative importance of variable $x_j$ is the sum of such squared
improvements over all internal nodes for which it was chosen
as the splitting variable, as:

$$\mathrm{imp}(j) = \sum_k i_k^2 \, I(\text{splits on } x_j)$$

where $i_k^2$ is the empirical improvement by the $k$-th node
splitting on $x_j$ at that point.
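
The sum in the importance formula can be illustrated with a few lines of Python. This is only a sketch: the `splits` layout (one record per internal node, holding the splitting variable and its squared improvement) and the numbers are hypothetical, and in practice these quantities come from the fitted boosting model.

```python
from collections import defaultdict

def squared_importance(splits):
    """Sum i_k^2 over the internal nodes that split on each variable.

    splits: iterable of (variable, improvement_sq) pairs, one per internal
    node across all trees (hypothetical layout).
    """
    imp = defaultdict(float)
    for variable, improvement_sq in splits:
        # The indicator I(splits on x_j) selects the nodes that use x_j.
        imp[variable] += improvement_sq
    return dict(imp)

# Made-up improvements, for illustration only.
splits = [("carbs", 4.0), ("sub_dim_1", 1.0), ("carbs", 5.0)]
imp = squared_importance(splits)
```

A variable that is never chosen as a splitting variable does not appear in the result, which corresponds to an importance of zero.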

6.2 Results 

The overall prediction performance is shown in Fig. 8.
Surprisingly, even with a full list of ingredients, the
prediction accuracy is only improved from .712 (baseline) to
.746. In contrast, the nutrition information and ingredient
networks are more effective (with accuracy .753 and .786,
respectively). Both of them have much lower dimensions (from
tens to several hundreds), compared with the full ingredients
that are represented by more than 2000 dimensions (1000
ingredients per recipe in the pair). The ingredient network
features lead to impressive performance, close to the best
performance given by the combined set (.792), indicating
the power of network structures in recipe recommendation.

Figure 11: Relative importance of features from
nutrition information. The carbs item is the most
influential feature in predicting higher-rated recipes.

Figure 9 shows the influence of different features in the
combined feature set. Up to 100 features with the highest
relative importance are shown. The importance of a feature
group is summarized by the share of the total importance
contributed by all features in the group. For example, the
baseline, consisting of cooking effort and cooking methods,
contributes 8.9% to the overall performance. The individual
items from nutrition information are very indicative in
differentiating highly-rated recipes, while most of the prediction
power comes from ingredient networks (84%).
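
The group-level summary described above amounts to summing importances within each feature group and normalizing by the grand total. The sketch below assumes a per-feature importance table and a feature-to-group mapping; both, along with all names and numbers, are made up for illustration and are not the paper's values.

```python
from collections import defaultdict

def group_shares(importance, group_of):
    """Percentage of the total importance contributed by each feature group."""
    totals = defaultdict(float)
    for feature, value in importance.items():
        totals[group_of[feature]] += value
    grand_total = sum(totals.values())
    return {group: 100.0 * value / grand_total
            for group, value in totals.items()}

# Hypothetical importances and group assignments.
importance = {"prep_time": 1.0, "carbs": 2.0,
              "pmi_dim_3": 4.0, "sub_dim_1": 3.0}
group_of = {"prep_time": "baseline", "carbs": "nutrition",
            "pmi_dim_3": "networks", "sub_dim_1": "networks"}
shares = group_shares(importance, group_of)
```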

Figure 10 shows the top 100 features from the three networks.
In terms of the total importance of ingredient network
features, the substitution network has a slightly stronger
contribution (39.8%) than the other two networks, and it
also has more influential features in the top 100 list. This
suggests that the structural information extracted from the
substitution network is not only important but also
complementary to information from other aspects.

Looking into the nutrition information (Fig. 11), we found
that carbohydrates are the most influential feature in
predicting higher-rated recipes. Since carbohydrates comprise
around 50% or more of total calories, the high importance
of this feature interestingly suggests that a recipe's rating
can be influenced by users' concerns about nutrition and
diet. Another interesting observation is that, while individual
nutrition items are powerful predictors, a higher prediction
accuracy can be reached by using ingredient networks
alone, as shown in Fig. 8. This implies that the information
about nutrition may have been encoded in the ingredient
network structure, e.g., substitutions of less healthful
ingredients with "healthier" alternatives.

Constructing the ingredient network features involves
reducing high-dimensional network information through SVD,
as described in the previous section. The dimensionality can
be determined by cross-validation. As shown in Fig. 12,
features with a very large dimension tend to overfit the
training data. Hence we chose k = 50 for the reduced dimension
of all three networks. The figure also shows that using the
information about the complement network alone is more
effective in prediction than using either the co-occurrence
or the substitute network, even in the case of low dimensions.
Consistently, as shown in terms of relative importance
(Fig. 10), the substitution network alone is not the most
effective, but it provides more complementary information in
the combined feature set.

Figure 12: Prediction performance over reduced
dimensionality. The best performance is given by
reduced dimension k = 50 when combining all three
networks. In addition, using the information about
the complement network alone is more effective in
prediction than using the other two networks.

Figure 13: Influential substitution communities.
The matrix shows the most influential feature dimensions
extracted from the substitution network.
For each dimension, the six representative ingredients
with the highest intensity values are shown,
with colors indicating their intensity. These features
suggest that the communities of ingredient substitutes,
such as the sweet and oil substitutes in the first
dimension, are particularly informative in prediction.



In Figure 13 we show the most representative ingredients
in the decomposed matrix derived from the substitution network.
We display the top five influential dimensions, evaluated
based on the relative importance, from the SVD resultant
matrix $V_k$, and in each of these dimensions we
extracted six representative ingredients based on their
intensities in the dimension (the squared entry values). These
representative ingredients suggest that the communities of
ingredient substitutes, such as the sweet and oil substitutes
in the first dimension or the milk substitutes in the second
dimension (which is similar to the cluster shown in Fig. ISp),
are particularly informative in predicting recipe ratings.
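
Picking representative ingredients by intensity can be sketched as below. The snippet assumes that the rows of `Vt_k` are the retained SVD dimensions, that `names` lists ingredients in column order, and that the indices of the influential dimensions (`dims`) have already been chosen by relative importance; the function name and all of these inputs are hypothetical.

```python
import numpy as np

def representative_ingredients(Vt_k, names, dims, top=6):
    """For each chosen dimension, list the `top` ingredients with the
    largest squared entries (intensities) in that row of V_k^T."""
    result = {}
    for d in dims:
        intensity = Vt_k[d] ** 2             # squared entry values
        order = np.argsort(-intensity)[:top]  # indices by descending intensity
        result[d] = [names[i] for i in order]
    return result
```

Squaring the entries ranks ingredients by magnitude regardless of sign, which matters because the sign of a singular vector is arbitrary.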

To summarize our observations, we find that we are able to
effectively predict users' preference for a recipe, but not
by using a full list of ingredients. Instead,
by using the structural information extracted from the
relationships among ingredients, we can better uncover users'
preferences about recipes.

7. CONCLUSION 

Recipes are little more than instructions for combining 
and processing sets of ingredients. Individual cookbooks, 
even the most expansive ones, contain single recipes for each 
dish. The web, however, permits collaborative recipe gen- 
eration and modification, with tens of thousands of recipes 
contributed to individual websites. We have shown how this
data can be used to glean insights about regional preferences 
and modifiability of individual ingredients, and also how it 
can be used to construct two kinds of networks, one of in- 
gredient complements, the other of ingredient substitutes. 
These networks encode which ingredients go well together, 
and which can be substituted to obtain superior results, and 
permit one to predict, given a pair of related recipes, which 
one will be more highly rated by users. 

In future work, we plan to extend ingredient networks to 
incorporate the cooking methods as well. It would also be 
of interest to generate region-specific and diet-specific rat- 
ings, depending on the users' background and preferences. 
A whole host of user-interface features could be added for 
users who are interacting with recipes, whether the recipe 
is newly submitted, and hence unrated, or whether they are 
browsing a cookbook. In addition to automatically predict- 
ing a rating for the recipe, one could flag ingredients that 
can be omitted, ones whose quantity could be tweaked, as 
well as suggested additions and substitutions. 